26 research outputs found

    On Separate Normalization in Self-supervised Transformers

    Self-supervised training methods for transformers have demonstrated remarkable performance across various domains. Previous transformer-based models, such as masked autoencoders (MAE), typically utilize a single normalization layer for both the [CLS] symbol and the tokens. In this paper, we propose a simple modification that employs separate normalization layers for the tokens and the [CLS] symbol to better capture their distinct characteristics and enhance downstream task performance. Our method aims to alleviate the potential negative effects of using the same normalization statistics for both token types, which may not be optimally aligned with their individual roles. We empirically show that with a separate normalization layer, the [CLS] embeddings better encode global contextual information and are distributed more uniformly in their anisotropic embedding space. Replacing the conventional normalization layer with the two separate layers yields an average 2.7% performance improvement across the image, natural language, and graph domains. Comment: NIPS 202
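
    The change described above is small enough to sketch: below is a minimal PyTorch sketch of separate normalization for the [CLS] symbol and the tokens, assuming the [CLS] embedding sits at position 0 of the sequence. The module name and layout are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SeparateNorm(nn.Module):
    """Normalize the [CLS] embedding and the other tokens independently."""

    def __init__(self, dim: int):
        super().__init__()
        self.cls_norm = nn.LayerNorm(dim)    # learned parameters for the [CLS] symbol only
        self.token_norm = nn.LayerNorm(dim)  # learned parameters for the remaining tokens

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1 + num_tokens, dim), with [CLS] at position 0.
        cls_out = self.cls_norm(x[:, :1])
        tok_out = self.token_norm(x[:, 1:])
        return torch.cat([cls_out, tok_out], dim=1)

# Drop-in usage: replace a shared nn.LayerNorm(dim) in the transformer
# block with SeparateNorm(dim); input and output shapes are unchanged.
```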

    A Systematic Survey of Chemical Pre-trained Models

    Deep learning has achieved remarkable success in learning representations for molecules, which is crucial for various biochemical applications, ranging from property prediction to drug design. However, training Deep Neural Networks (DNNs) from scratch often requires abundant labeled molecules, which are expensive to acquire in the real world. To alleviate this issue, tremendous efforts have been devoted to Chemical Pre-trained Models (CPMs), where DNNs are pre-trained on large-scale unlabeled molecular databases and then fine-tuned on specific downstream tasks. Despite this progress, the fast-growing field still lacks a systematic review. In this paper, we present the first survey that summarizes the current progress of CPMs. We first highlight the limitations of training molecular representation models from scratch to motivate CPM studies. Next, we systematically review recent advances on this topic from several key perspectives, including molecular descriptors, encoder architectures, pre-training strategies, and applications. We also highlight the challenges and promising avenues for future research, providing a useful resource for both the machine learning and scientific communities. Comment: IJCAI 2023, Survey Track
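
    As a concrete illustration of the pre-train/fine-tune paradigm the survey reviews, the sketch below shows a masked-atom-type pre-training loop on unlabeled molecular graphs. The encoder, prediction head, data loader, and mask_fn are hypothetical placeholders, not components of any specific surveyed model.

```python
import torch
import torch.nn as nn

def pretrain_masked_atoms(encoder: nn.Module, head: nn.Module,
                          unlabeled_loader, mask_fn, lr: float = 1e-4):
    """One epoch of masked-atom-type pre-training (schematic)."""
    params = list(encoder.parameters()) + list(head.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for graph in unlabeled_loader:
        # mask_fn (placeholder) hides random atom types and returns the
        # corrupted graph, the true types, and the masked positions.
        masked_graph, target_types, masked_idx = mask_fn(graph)
        atom_embeddings = encoder(masked_graph)     # (num_atoms, dim)
        logits = head(atom_embeddings[masked_idx])  # predict the hidden types
        loss = loss_fn(logits, target_types)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return encoder  # reuse: fine-tune on labeled downstream property tasks
```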

    Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model

    Transition state (TS) search is key in chemistry for elucidating reaction mechanisms and exploring reaction networks. The search for accurate 3D TS structures, however, requires numerous computationally intensive quantum chemistry calculations due to the complexity of potential energy surfaces. Here, we developed an object-aware SE(3) equivariant diffusion model that satisfies all physical symmetries and constraints for generating the set of structures in an elementary reaction, i.e., reactant, TS, and product. Provided the reactant and product, this model generates a TS structure in seconds instead of the hours required by quantum chemistry-based optimizations. The generated TS structures achieve an average error of 0.13 Å root mean square deviation compared to the true TS. With a confidence scoring model for uncertainty quantification, we approach the accuracy required for reaction rate estimation (2.6 kcal/mol) by performing quantum chemistry-based optimizations on only the 14% most challenging reactions. We envision the proposed approach being useful for constructing and pruning large reaction networks with unknown mechanisms.
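
    The confidence-based workflow described above can be sketched independently of the diffusion model itself: generate a TS guess for every reaction, rank the guesses by confidence score, and fall back to quantum chemistry optimization for only the least confident fraction. The function names below (generate_ts, confidence, qm_optimize) are hypothetical placeholders.

```python
def triage_transition_states(reactions, generate_ts, confidence, qm_optimize,
                             fallback_fraction=0.14):
    """Generate TS guesses, then re-optimize only the least confident ones."""
    candidates = [(rxn, generate_ts(rxn)) for rxn in reactions]
    # Rank generated structures from least to most confident.
    ranked = sorted(candidates, key=lambda pair: confidence(pair[1]))
    n_fallback = int(len(ranked) * fallback_fraction)
    # Keep the model's TS for confident cases; run quantum chemistry
    # optimization for the uncertain remainder.
    return [(rxn, qm_optimize(rxn) if i < n_fallback else ts)
            for i, (rxn, ts) in enumerate(ranked)]
```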

    Improving Molecular Pretraining with Complementary Featurizations

    Molecular pretraining, which learns molecular representations over massive unlabeled data, has become a prominent paradigm for solving a variety of tasks in computational chemistry and drug discovery. Recently, rapid progress has been made in molecular pretraining with different molecular featurizations, including 1D SMILES strings, 2D graphs, and 3D geometries. However, the role of molecular featurizations, with their corresponding neural architectures, in molecular pretraining remains largely unexamined. In this paper, through two case studies -- chirality classification and aromatic ring counting -- we first demonstrate that different featurization techniques convey chemical information differently. In light of this observation, we propose a simple and effective MOlecular pretraining framework with COmplementary featurizations (MOCO). MOCO comprehensively leverages multiple featurizations that complement each other and outperforms existing state-of-the-art models that rely solely on one or two featurizations on a wide range of molecular property prediction tasks. Comment: 24 pages, work in progress
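
    A minimal sketch of the fusion idea, assuming three pluggable encoders for the 1D, 2D, and 3D views of the same molecule: the view embeddings are combined with learned weights. This is a generic weighted combination, not MOCO's exact architecture.

```python
import torch
import torch.nn as nn

class MultiViewFusion(nn.Module):
    """Combine embeddings of the same molecule from complementary views."""

    def __init__(self, smiles_enc: nn.Module, graph_enc: nn.Module,
                 geom_enc: nn.Module):
        super().__init__()
        self.encoders = nn.ModuleList([smiles_enc, graph_enc, geom_enc])
        self.view_logits = nn.Parameter(torch.zeros(3))  # learned mixing weights

    def forward(self, smiles, graph, geometry) -> torch.Tensor:
        # Each encoder maps its view of the molecule to a (batch, dim) embedding.
        views = [enc(v) for enc, v in
                 zip(self.encoders, (smiles, graph, geometry))]
        weights = torch.softmax(self.view_logits, dim=0)
        # Weighted sum of per-view embeddings; a downstream head consumes it.
        return sum(w * v for w, v in zip(weights, views))
```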

    A new perspective on building efficient and expressive 3D equivariant graph neural networks

    Geometric deep learning enables the encoding of physical symmetries in modeling 3D objects. Despite rapid progress in encoding 3D symmetries into Graph Neural Networks (GNNs), a comprehensive evaluation of the expressiveness of these networks through a local-to-global analysis is still lacking. In this paper, we propose a local hierarchy of 3D isomorphism to evaluate the expressive power of equivariant GNNs and investigate the process of representing global geometric information from local patches. Our work leads to two crucial modules for designing expressive and efficient geometric GNNs: local substructure encoding (LSE) and frame transition encoding (FTE). To demonstrate the applicability of our theory, we propose LEFTNet, which effectively implements these modules and achieves state-of-the-art performance on both scalar-valued and vector-valued molecular property prediction tasks. We further point out the design space for future developments of equivariant graph neural networks. Our code is available at https://github.com/yuanqidu/LeftNet.
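
    One common ingredient of frame-based designs like FTE is an orthonormal local frame built from relative-position vectors; projecting vector features onto such a frame turns them into rotation-invariant scalars. The sketch below is that generic Gram-Schmidt construction, not LEFTNet's exact code.

```python
import torch

def local_frame(r1: torch.Tensor, r2: torch.Tensor, eps: float = 1e-8):
    """Build an orthonormal local frame from two relative-position vectors."""
    # r1, r2: (batch, 3) relative positions around a node or edge.
    e1 = r1 / (r1.norm(dim=-1, keepdim=True) + eps)
    u2 = r2 - (r2 * e1).sum(-1, keepdim=True) * e1  # remove the e1 component
    e2 = u2 / (u2.norm(dim=-1, keepdim=True) + eps)
    e3 = torch.cross(e1, e2, dim=-1)                # right-handed third axis
    return torch.stack([e1, e2, e3], dim=-2)        # (batch, 3, 3) frame
```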

    MUBen: Benchmarking the Uncertainty of Pre-Trained Models for Molecular Property Prediction

    Large Transformer models pre-trained on massive unlabeled molecular data have shown great success in predicting molecular properties. However, these models can be prone to overfitting during fine-tuning, resulting in over-confident predictions on test data that fall outside the training distribution. To address this issue, uncertainty quantification (UQ) methods can be used to improve the models' calibration of predictions. Although many UQ approaches exist, not all of them lead to improved performance. While some studies have used UQ to improve molecular pre-trained models, the process of selecting suitable backbone and UQ methods for reliable molecular uncertainty estimation remains underexplored. To address this gap, we present MUBen, which evaluates different combinations of backbone and UQ models to quantify their performance for both property prediction and uncertainty estimation. By fine-tuning various backbone molecular representation models with different molecular descriptors as inputs and UQ methods from different categories, we critically assess the influence of architectural decisions and training strategies. Our study offers insights for selecting UQ and backbone models, which can facilitate research on uncertainty-critical applications in fields such as materials science and drug discovery.
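
    As an example of the kind of UQ method such a benchmark pairs with a backbone, the sketch below implements Monte Carlo dropout at inference time: dropout stays active, and the spread across repeated stochastic forward passes serves as the uncertainty estimate. Here, model is any fine-tuned backbone; this is an illustration, not MUBen's API.

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model: torch.nn.Module, x: torch.Tensor,
                       n_samples: int = 20):
    """Predictive mean and uncertainty via Monte Carlo dropout."""
    model.train()  # keep dropout active at test time (assumes no batch norm)
    preds = torch.stack([model(x) for _ in range(n_samples)])
    model.eval()
    # Mean across stochastic passes is the prediction; the standard
    # deviation across passes is the uncertainty estimate.
    return preds.mean(dim=0), preds.std(dim=0)
```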

    M²Hub: Unlocking the Potential of Machine Learning for Materials Discovery

    We introduce M²Hub, a toolkit for advancing machine learning in materials discovery. Machine learning has achieved remarkable progress in modeling molecular structures, especially biomolecules for drug discovery. However, the development of machine learning approaches for modeling materials structures lags behind, partly due to the lack of an integrated platform that enables access to diverse tasks for materials discovery. To bridge this gap, M²Hub enables easy access to materials discovery tasks, datasets, machine learning methods, evaluations, and benchmark results that cover the entire workflow. Specifically, the first release of M²Hub focuses on three key stages in materials discovery: virtual screening, inverse design, and molecular simulation, including 9 datasets that cover 6 types of materials, with 56 tasks across 8 types of material properties. We further provide 2 synthetic datasets for generative tasks on materials. In addition to random data splits, we also provide 3 additional data partitions to reflect real-world materials discovery scenarios. State-of-the-art machine learning methods (including those suitable for materials structures but never compared in the literature) are benchmarked on representative tasks. Our code and library are publicly available at https://github.com/yuanqidu/M2Hub.

    A meaningful exploration of ofatumumab in refractory NMOSD: a case report

    Objective: To report the case of a patient with refractory neuromyelitis optica spectrum disorder (NMOSD) who, despite a poor response or intolerance to multiple immunosuppressants, was successfully treated with ofatumumab. Case presentation: A 42-year-old female was diagnosed with NMOSD at the first episode of the disease. Despite treatment with intravenous methylprednisolone, immunoglobulin, rituximab, and immunoadsorption, together with oral steroids, azathioprine, mycophenolate mofetil, and tacrolimus, she experienced various adverse events, such as abnormal liver function, repeated infections, fever, rashes, and hemorrhagic shock, and had five relapses over the ensuing four years. Clinicians therefore decided to initiate ofatumumab to control the disease. The patient received 9 doses of ofatumumab over the next 10 months at customized intervals; her symptoms remained stable, with no recurrence and no adverse events. Conclusion: Ofatumumab might serve as an effective and safe alternative for NMOSD patients who are resistant to other current immunotherapies.

    Smoking Cessation With 20 Hz Repetitive Transcranial Magnetic Stimulation (rTMS) Applied to Two Brain Regions: A Pilot Study

    Chronic smoking impairs brain functions in the prefrontal cortex and the projecting meso-cortical limbic system. The purpose of this pilot study was to examine whether modulating frontal brain activity with high-frequency repetitive transcranial magnetic stimulation (rTMS) can improve smoking cessation, and to explore how brain activity changes after treatment. Fourteen treatment-seeking smokers were offered a program of 10 days of rTMS treatment with a follow-up for another 25 days. 20 Hz rTMS was applied sequentially to the left dorsolateral prefrontal cortex (DLPFC) and the superior medial frontal cortex (SMFC). Carbon monoxide (CO) levels, withdrawal and craving scales, and neuroimaging data were collected. Ten smokers completed the entire treatment program, and 90% of them did not smoke during the 25-day follow-up period. A significant reduction in smoking craving and in resting brain activity, measured by cerebral blood flow (CBF) and brain entropy (BEN), was observed after 10 days of 20 Hz rTMS treatment compared to baseline. Although limited by the sample size, these pilot findings point to the high potential of multiple-target high-frequency rTMS for smoking cessation and the utility of fMRI for objectively assessing treatment effects.